06:39
2026-05-19
dev.to
artificial-intelligence
OpenAI Operator scores 43% on hard web tasks. We scored 81%. Here are all 300 runs.
TinyFish achieved an 81% success rate on the Mind2Web benchmark, significantly outperforming OpenAI Operator's 43% score on hard web tasks. The company tested its web agent across all 300 tasks on liv…